Aperture Neuro — Latest Matching Preprints

1

PIE Toolbox: SSM-PCA Based Software for PET Diagnostic Pattern Analysis

Romanov, M.; Kireev, M.; Didur, M.; Cherednichenko, D.; Korotkov, A.; Valdes-Sosa, P.; Fan, Q.; Wang, Q.

2026-06-01 radiology and imaging 10.64898/2026.05.28.26354341 medRxiv

Top 0.3%

0.9%

One of the prominent methods in neuroimaging data processing is SSM-PCA, which is based on principal component analysis and allows for the identification of diagnostically significant patterns in the form of statistical maps. We developed software, PIE Toolbox, that employs SSM-PCA and classification based on diagnostic patterns obtained from functional and structural tomographic brain imaging. The program supports the entire analysis pipeline, including preprocessing of brain images, extraction of diagnostic patterns, building of classification models, and prediction based on them. The resulting diagnostic patterns are weighted principal components obtained through SSM-PCA, or their linear combinations. PIE Toolbox allows selection of relevant structural and functional brain patterns, computation of their expression values in regions of interest, classification using support vector machines, and evaluation of model performance via cross-validation. This approach enables the use of patterns as features of intergroup differences for individual diagnosis. The software has been validated on both simulated and ADNI datasets.
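
As a rough illustration of the preprocessing that usually precedes the PCA step in SSM-PCA, the sketch below log-transforms a subjects-by-voxels matrix and then removes each subject's mean and the group mean profile (double centering). It assumes the textbook scaled subprofile model procedure; whether PIE Toolbox implements the step exactly this way is not stated in the abstract, and the function name and toy values are hypothetical.

```python
import math


def subject_residual_profiles(data):
    """Log-transform and double-center a subjects x voxels matrix.

    Assumption: this follows the generic scaled subprofile model recipe,
    not necessarily the PIE Toolbox implementation.
    """
    logged = [[math.log(v) for v in row] for row in data]

    # Remove each subject's own mean across voxels (row centering).
    row_centered = []
    for row in logged:
        row_mean = sum(row) / len(row)
        row_centered.append([v - row_mean for v in row])

    # Remove the group mean profile across subjects (column centering).
    n_subjects = len(row_centered)
    col_means = [sum(col) / n_subjects for col in zip(*row_centered)]
    return [[v - m for v, m in zip(row, col_means)] for row in row_centered]


# Hypothetical toy data: 3 subjects, 3 voxels (positive intensities).
toy = [[10.0, 20.0, 40.0], [12.0, 18.0, 50.0], [8.0, 25.0, 35.0]]
srp = subject_residual_profiles(toy)
```

After this step every row and every column of `srp` sums to zero (up to rounding), so a subsequent PCA captures covariance between regions rather than global or subject-level offsets.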

2

TopBrain Segmentation Challenge for Whole Brain Vessel Anatomy

Yang, K.; Shi, P.; Huang, H.; Musio, F.; Baazaoui, H.; Aydin, O. U.; Hilbert, A.; Hamadache, R. E.; Yalcin, C.; Zhang, M.; Falcetta, D.; de la Rosa, E.; Shit, S.; Prabhakar, C.; Wittmann, B.; Rokuss, M. R.; Kirchhoff, Y.; Al-Maskari, R.; Hoeher, L.; Juchler, N.; Casamitjana, A.; Cleary, J.; Schmick, A.; Baumgartner, P.; Deseoe, J.; Vandans, O.; Lee, D.; Oh, K.; LaBella, D.; Mazher, M.; Niederer, S. A.; Qayyum, A.; Liu, Y.; Chen, J.; Kim, W.; Asawalertsak, N.; Kim, M.; Shin, D.; Park, S.-H.; Kikuchi, S.; Zhang, Y.; Liu, J.; Cui, Y.; Qiu, Y.; Verschuur, A.; Zhang, J.; van der Schaaf, I.; Su, R.;

2026-05-30 radiology and imaging 10.64898/2026.05.28.26354312 medRxiv

Top 0.4%

0.7%

We present the TopBrain 2025 Challenge, the first benchmark for fine-grained multiclass segmentation of the whole brain vasculature in both computed tomography angiography (CTA) and magnetic resonance angiography (MRA). Building on the TopCoW challenge, TopBrain scales vessel annotation from the Circle of Willis to the entire brain, introducing a dataset of 90 annotated volumes across 48 landmark vessel classes spanning arterial and venous systems, of which 50 training volumes are publicly released. Vessel definitions were consolidated from established neuroanatomical references into a unified annotation scheme, and vessel caliber measurements along the centerline are reported for the first time across the whole brain vascular anatomy. To address the unique challenges of multiclass brain vessel segmentation, we propose an evaluation framework that accounts for detection in segmentation performance, assesses anatomical plausibility, and introduces novel contamination metrics that characterize inter-class prediction errors. Fifteen teams from over 220 registered participants submitted algorithms to the benchmark. The top-performing teams built on nnUNet with principled system design choices, achieving around 80% Dice scores, near-zero invalid neighbor counts, over 60% F1 scores for side-road vessels, and below 18% foreground contamination ratio. Larger vessels are easier to segment, while smaller and more complex vessels remain the true bottleneck. The annotated datasets and podium-finish algorithms are made publicly available on Zenodo.
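
For readers unfamiliar with the headline metric, the following is a minimal sketch of a per-class Dice score on flattened label arrays. It is not the challenge's evaluation code; returning `None` when a class is absent from both prediction and reference is an assumption made here to keep absent classes from being scored, and the labels are hypothetical.

```python
def dice(pred, truth, label):
    """Dice overlap for one class label on two equal-length label lists."""
    n_pred = sum(1 for v in pred if v == label)
    n_truth = sum(1 for v in truth if v == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if n_pred + n_truth == 0:
        return None  # class absent in both: undefined rather than 0 or 1
    return 2 * overlap / (n_pred + n_truth)


# Hypothetical flattened label maps: 0 = background, 1 and 2 = vessel classes.
truth = [0, 1, 1, 2, 2, 2]
pred = [0, 1, 2, 2, 2, 0]

dice_class_1 = dice(pred, truth, 1)
dice_class_2 = dice(pred, truth, 2)
dice_class_3 = dice(pred, truth, 3)
```

In this toy case one voxel of class 1 is mislabelled as class 2, which lowers the Dice score of both classes; this is the kind of inter-class error the abstract's contamination metrics are meant to characterize.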

3

Left Ventricular Volume and Function Assessment Using a Reduced-Slice Approach in Cardiovascular Magnetic Resonance

Tejaswi, A.; Fyrdahl, A.; Sigfridsson, A.

2026-06-01 cardiovascular medicine 10.64898/2026.05.29.26354413 medRxiv

Top 0.6%

0.3%

Background: Cardiovascular magnetic resonance (CMR) quantification of the left ventricular (LV) volumes and ejection fraction (EF) typically involves manual segmentation of many short axis (SAx) and long axis (LAx) slices of the left ventricle. The scan time and the number of breath holds is proportional to the number of slices. We aimed to evaluate a geometric model of the left ventricle that could enable planimetry from a reduced number of slices. We sought to determine whether acceptable accuracy was retained for evaluating the End Diastolic Volume (EDV), End Systolic Volume (ESV), Stroke Volume (SV), and EF to provide a rapid and reliable clinical alternative. Methods: A cohort of 342 patients, median age: 54 (40 - 65) years, with full-stack CMR examinations was used. Nine geometrical combinations were evaluated: 3, 4 or 5 short axis slices and one of three LAx orientations (2-chamber, 3-chamber or 4-chamber) by retrospectively decimating the full-stack acquisition. LV volumes were calculated as a sum of trapezoidal approximations for apical and mid-cavity slices and a generalized prismoidal model at the base. The accuracy of the volume calculations was quantified against the full-stack reference for the EDV, ESV, SV, and EF using concordance correlation coefficient (CCC), two-way repeated measures ANOVA, pairwise tests, and Bayes factor log10(BF10) analysis. Results: The choice of the long axis (LAx) view was the most influential driver of accuracy (g2 = 0.104, for EDV), approximately 50 times more impactful than the number of SAx slices (g2 = 0.002, for EDV). Volumes calculated using the combination of 2-chamber LAx view and 5 SAx slices had the highest concordance with the full stack (CCC>0.90). While the estimated absolute volumes displayed a systematic negative bias, EF and SV remained highly robust due to bias cancellation. 
For a 2ch + 5 SAx protocol, EF bias was just 0.83% (LoA: -6.18 to 7.84%), with a minimum detectable change (MDC) of 7.01%, compared to 8.7% reported for expert human readers, suggesting strong concordance. Bayesian paired-samples t-tests yielded log10(BF10) = 6.42 in favor of 5 SAx over 3 SAx, constituting decisive evidence on the Jeffreys scale. The bias and limits of agreement (LoA) for stroke volume and ejection fraction were found to be lower than scan-rescan reproducibility in literature. Conclusion: This reduced-slice geometric model allows for reduced number of breath holds compared to a conventional full-stack CMR acquisition and provides an acceptable accuracy with bias less than scan-rescan variability.
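
The agreement statistics quoted above can be sketched as a standard Bland-Altman computation: bias is the mean paired difference and the limits of agreement are bias ± 1.96 standard deviations of the differences. The EF values below are hypothetical, and the function is an illustration, not the authors' analysis code.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


# Hypothetical EF values (%) from a reduced-slice and a full-stack analysis.
reduced = [58.0, 61.5, 49.0, 66.0, 55.5]
full = [57.0, 60.0, 50.0, 64.5, 55.0]
bias, lower, upper = bland_altman(reduced, full)

# Arithmetic check on the reported interval (-6.18 to 7.84):
reported_half_width = (7.84 - (-6.18)) / 2
reported_midpoint = (7.84 + (-6.18)) / 2
```

The reported interval has a midpoint of 0.83 and a half-width of 7.01, matching the stated EF bias and the stated MDC, which suggests the MDC here corresponds to 1.96 standard deviations of the paired differences.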

4

Quantifying the Optimism of Naive Cross-Validation for Binary Outcome Prediction with Repeated-Measures Predictors: A Simulation Study and Clinical Illustration

Hagan, J.

2026-05-29 epidemiology 10.64898/2026.05.27.26354222 medRxiv

Top 0.7%

0.3%

Background. Cross-validation (CV) is widely used to estimate predictive performance, but can overestimate performance when applied at the observation level to repeated-measures data. When continuous predictor variables are measured repeatedly within subjects and the binary outcome is defined at the subject level, naive observation-level CV introduces data leakage through within-subject dependence, producing optimistically biased estimates of the area under the receiver operating characteristic curve (AUROC). The magnitude of this bias and the performance of alternative partitioning strategies have not been formally characterized for this data structure. Methods. Three CV strategies were compared for estimating subject-level AUROC in ridge logistic regression models: naive observation-level 10-fold CV, subject-level 10-fold CV, and leave-one-cluster-out (LOCO) CV. The framework was applied to a motivating clinical dataset of daily oxygenation measures and retinopathy of prematurity outcomes among 101 extremely low birth weight infants. A factorial simulation study was conducted across 162 parameter combinations varying cluster count (20-150), intraclass correlation (0.1-0.5), within-cluster autocorrelation (0.2-0.8), and outcome prevalence (10-35%), with 500 simulated datasets per condition (76,389 valid datasets total). Results. In the motivating dataset, naive CV produced optimism of +0.078 AUROC units for severe ROP prediction (15 events, 101 subjects) and +0.031 for any ROP prediction (48 events). Subject-level 10-fold CV closely approximated LOCO (deviation ≤ 0.015). In the simulation, naive CV optimism ranged from +0.039 to +0.204 across all conditions, increasing monotonically with higher ICC, higher autocorrelation, fewer clusters, and lower event rates. Subject-level 10-fold CV was essentially unbiased relative to LOCO across all 162 conditions (mean absolute deviation = 0.002). Conclusions. 
Naive observation-level CV meaningfully overestimates discriminative performance in the repeated-measures binary outcome setting and should not be used. Subject-level CV partitioning effectively eliminates this bias. Accordingly, subject-level partitioning should be considered essential, not optional, when validating prediction models using repeated-measures data with subject-level outcomes.
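
The leakage mechanism described above can be made concrete with a small sketch contrasting observation-level and subject-level fold assignment. The helper names and the toy layout (6 subjects with 4 observations each, 3 folds) are hypothetical, and the assignment is deterministic for clarity; a real analysis would shuffle subjects before assigning folds.

```python
def naive_folds(n_obs, k):
    """Assign folds per observation, ignoring which subject it belongs to."""
    return [i % k for i in range(n_obs)]


def subject_folds(subject_ids, k):
    """Assign folds per subject so all of a subject's rows share one fold."""
    fold_of_subject = {}
    for sid in subject_ids:
        if sid not in fold_of_subject:
            fold_of_subject[sid] = len(fold_of_subject) % k
    return [fold_of_subject[sid] for sid in subject_ids]


def leaked_subjects(subject_ids, folds):
    """Subjects whose observations fall into more than one fold."""
    seen = {}
    for sid, fold in zip(subject_ids, folds):
        seen.setdefault(sid, set()).add(fold)
    return sorted(sid for sid, fs in seen.items() if len(fs) > 1)


# Hypothetical layout: 6 subjects, 4 repeated observations each.
subject_ids = [s for s in range(6) for _ in range(4)]
naive = naive_folds(len(subject_ids), 3)
grouped = subject_folds(subject_ids, 3)

naive_leaks = leaked_subjects(subject_ids, naive)
grouped_leaks = leaked_subjects(subject_ids, grouped)
```

With the naive scheme every subject contributes rows to both training and test sets in some split, whereas the subject-level scheme leaves `grouped_leaks` empty, so correlated within-subject measurements cannot inform their own test predictions.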

5

Voxel-wise temporal decomposition of hypoxia-targeted BOLD MRI: method development and proof-of-concept application in glioblastoma

Schmidlechner, T.; Stumpo, V.; Jehli, E.; Zerweck, L.; Bellomo, J.; Gönel, M.; Müller, F.; Sebök, M.; Bink, A.; Kulcsar, Z.; Weller, M.; Regli, L.; Fierstra, J.; van Niftrik, C. H. B.

2026-05-29 radiology and imaging 10.64898/2026.05.27.26354265 medRxiv

Top 0.9%

0.2%

Hypoxia-targeted BOLD MRI is a novel technique that probes oxygenation physiology in response to a controlled transient hypoxia stimulus. In glioblastoma, the signal response is spatially and temporally heterogeneous. We developed a voxel-wise temporal decomposition framework for hypoxia-targeted BOLD MRI that separates the arrival of responses, transition phases, and steady state during controlled isocapnic hypoxia. Twenty healthy controls underwent 3-T BOLD MRI during a double hypoxic step challenge to establish a normative reference. Three patients with newly diagnosed glioblastoma were included as proof-of-concept cases. For each voxel, we estimated response arrival delay (Delaycorr), delay to plateau, delay to return and an O2-normalized steady-state response (HypoxiaSS). Healthy-control maps were used to construct a voxel-wise normative atlas and, for HypoxiaSS, a global-response-adjusted model for patient deviation mapping. In healthy controls, HypoxiaSS showed lower supratentorial between-subject variability than both whole-stimulus comparators (coefficient of variation: 1.77 versus 2.36 for Hypoxiaavg) and higher voxel-level step-to-step agreement (ICC(2,1): median 0.951 versus 0.792 for Hypoxiaavg). Whole-stimulus averaging exhibited a systematic step-2 signal amplification present in 19 of 20 subjects, which was absent from HypoxiaSS. A single global response scalar explained a median 72.5% of voxel-wise between-subject variance in HypoxiaSS. In proof-of-concept patient analyses, G-adjusted HypoxiaSS deviation maps and timing maps identified spatially coherent abnormalities that were partly complementary and extended beyond conventional MRI-defined lesion margins. Temporal decomposition improves the stability and interpretability of hypoxia-targeted BOLD MRI and provides a practical framework for population-referenced physiological mapping and atlas-based deviation mapping in glioblastoma.

6

The emotional impact of gambling-related advertising: an experimental functional Near-Infrared Spectroscopy study protocol

Daniel, L.-I.; Ros-Leon, A.; Molina-Rodriguez, S.; Pellicer-Porcar, O.; Cabrera-Perona, V.; Ibanez-Ballesteros, J.

2026-05-27 addiction medicine 10.64898/2026.05.20.26353682 medRxiv

Top 1%

0.1%

The proliferation of gambling advertising has intensified concerns regarding its influence on vulnerable populations, yet the neural mechanisms underlying cue-reactivity to these stimuli remain underexplored in ecologically valid settings. This study protocol proposes a novel methodological framework to investigate prefrontal cortical responses to gambling advertisements in individuals with varying degrees of gambling experience. Materials and methods: This cross-sectional study will recruit 44 participants, divided into a clinical group (individuals with high-frequency gambling or gambling disorder) and a matched control group. Neural activity will be recorded using fNIRS while participants view gambling-related, neutral, violent, and sexual stimuli. Secondary measures include validated scales for gambling severity (SOGS), impulsivity, sensation seeking, and alexithymia. Data analysis will primarily utilize inter-subject correlation (ISC) to quantify neural synchronization and multiband frequency decomposition to capture dynamic affective processing. Advanced preprocessing, including short-channel regression, will be applied to ensure signal robustness. Discussion: By combining portable neuroimaging with a data-driven ISC approach, this study aims to identify objective neural markers of gambling vulnerability. The findings will provide novel insights into the idiosyncratic processing of commercial stimuli, potentially informing public health policies and the development of more effective evidence-based regulations for gambling marketing.

7

Associations between serum estradiol and estrone and Alzheimer's disease biomarkers: an analysis in female participants from the European Prevention of Alzheimer's Dementia Longitudinal Cohort Study (EPAD LCS)

Shin, J.; Muniz-Terrera, G.; Ritchie, C.; Manson, J.; Plachecki, S.; Kirschbaum, C.; Gregory, S.

2026-05-30 epidemiology 10.64898/2026.05.27.26354257 medRxiv

Top 1%

0.1%

INTRODUCTION: Postmenopausal estrogen decline may contribute to Alzheimer's disease (AD) risk, but longitudinal evidence linking circulating estrogens to cerebrospinal fluid (CSF) biomarkers is lacking. METHODS: We analyzed 866 female participants from the European Prevention of AD Longitudinal Cohort Study with baseline serum estradiol and estrone measured by liquid chromatography tandem mass spectrometry and repeated CSF measurements of amyloid-beta (Aβ)42, phosphorylated (p) Tau181, and total (t) Tau. RESULTS: Neither estradiol nor estrone was associated with longitudinal Aβ42. Higher estradiol was associated with lower baseline tau and slower tau increases over time. Baseline estradiol-tau associations were stronger in apolipoprotein E (APOE) ε4 carriers, though APOE ε4 did not modify longitudinal associations. Amyloid positivity did not moderate hormone-tau associations but was associated with steeper tau increases over time. Estrone showed no significant associations. DISCUSSION: These findings suggest a more consistent relationship between estradiol and tau-related rather than amyloid-related pathology.

8

Keeping human in the loop: A three-phase generative AI workflow for research integrity in data-intensive science. A methodological case study using elite Ethiopian distance-running data

Galko, P.; Yisamaw, A.; Haugen, T.; Seiler, S.

2026-05-29 sports medicine 10.64898/2026.05.29.26354013 medRxiv

Top 1%

0.1%

Background: Generative AI tools can support data-intensive research by writing code, drafting prose, searching analytical possibilities, and stress-testing claims. They can also produce false citations, drift between statistical specifications, and lose continuity across long investigations. This paper describes a practical workflow for using AI systems in empirical research while keeping discovery, verification, and accountability inspectable. Methods: We developed and applied a three-phase human-AI workflow to a case study of 14 elite Ethiopian distance runners. The dataset contained 22,605 GPS-segments collected across 97 consecutive days in late 2025, supplemented by venue and athlete metadata collected in the field. Phase 1 used an autonomous data-exploration tool to pre-filter the hypothesis space across five seeded research questions. Phase 2 used an AI system under direct human guidance to construct candidate findings into numerical claims, verification scripts, and draft text. Phase 3 used an independent AI system in an adversarial role to stress-test methods, statistics, prose, figures, and citations. The workflow was informed by Pearl's distinction between association, intervention, and counterfactual reasoning, with human judgement retained for research direction, interpretation, and final claims. Results: The workflow produced three empirical analyses and a documented correction process. The analyses estimated an altitude-to-sea-level pace correction of +0.10 min/km per 1,000 m at matched heart rate, showed why pooled altitude-surface regression was not identifiable within this venue system, documented method-dependence in heart-rate-based intensity classification, characterised within-venue route variation as a 64/36 path-fixed-to-trail-variable split with the Sululta label resolving into two functionally distinct sub-venues, and reframed the cohort's training through a 3x3x3 prescription lattice grounded in Ethiopian coaching practice. 
The adversarial phase identified several hallucinated citations, a terminology error between HC1 and cluster-robust standard errors, and several inconsistencies between prose, figures, and computed results. Verification scripts re-derived nearly all numerical claims from the cleaned lap-level data. Conclusions: The case study shows how researchers can organise AI-assisted empirical work so that candidate discovery, claim construction, independent stress-testing, and final accountability remain separated. The workflow did not remove the need for domain expertise or human judgement. Its value was in making the route from candidate finding to manuscript claim explicit, reproducible, and open to challenge. Trial registration: Not applicable.

9

Choroid plexus calcification detection using quantitative susceptibility mapping MRI

Hett, K.; Dubois, A.; Bonitz, I.; Considine, C. M.; Eaton, J.; Mcknight, C. D.; Claassen, D. O.; Donahue, M. J. J.; Trujillo, P.

2026-05-28 radiology and imaging 10.64898/2026.05.26.26354154 medRxiv

Top 2%

0.1%

Purpose. The choroid plexus (ChP) is the primary source of cerebrospinal fluid and an emerging marker of cerebral health, with enlargement and hypoperfusion reported in aging and neurodegeneration. However, frequent ChP calcifications can confound volumetric and perfusion measures. Although computed tomography (CT) is the gold standard for detecting calcification, it is rarely available in research MRI. Quantitative susceptibility mapping (QSM) offers an alternative sensitive to diamagnetic mineralization but lacks validated susceptibility thresholds. Method. Participants underwent CT and MRI within four weeks, including 3D T1-weighted and a multi-echo gradient echo QSM MRI. ChP calcifications were identified on CT using standard diagnostic criteria. Using the Bayes decision boundary framework, we identified optimal susceptibility thresholds for detecting diamagnetic signals consistent with calcification and compared these thresholds with multiple density levels measured on gold standard CT images. Results. Across all participants (n=20; age=62.2 ± 12.0 yrs), the optimal susceptibility threshold separating background ChP signal from calcifications was -0.10 ppm at 60 HU (low-density) and -0.15 ppm at 100 HU (high-density). Susceptibility values within calcified tissue exhibited a linear relationship with CT-derived tissue density. A significant positive association was observed between ChP volume and calcification volume among participants with detectable calcification (beta=2.26, p=0.047). Conclusion. This work should provide a practical framework for quantifying ChP calcifications routinely from MRI. The observed relationship between ChP volume and calcification volume highlights the importance of accounting for calcified tissue, particularly when calcification burden is substantial, when investigating ChP abnormalities in aging and neurodegenerative disease.

10

Redefining Extent Of Resection After Meningioma Surgery: a Multicentre Observational Machine Learning Analysis Comparing Simpson, Radiological and Volumetric Grading

Pandit, A. S.; Deehan, M.; Moudgil-Joshi, J.; Reischer, G.; Mathew, S.; Pace, G.; Fatania, G.; Dalton, A.; Nair, R.; Hyare, H.; Mallon, D.; Kitchen, N.; Marcus, H. J.; Nachev, P.

2026-05-27 oncology 10.64898/2026.05.23.26353944 medRxiv

Top 2%

0.0%

Background: Extent of resection remains central to meningioma management, yet Simpson grading is subjective and may not reflect measurable postoperative residual disease. We compared surgeon-reported Simpson grade, report-derived radiological grading, and residual tumour volumetry across a multicentre cohort. Methods: We performed a retrospective study across two tertiary neurosciences centres comprising four hospitals, including patients undergoing primary cranial meningioma resection from 2006 to 2025. Postoperative magnetic resonance imaging (MRI) reports were harmonised using weakly supervised natural language processing based on term frequency-inverse document frequency (TF-IDF) and a linear support vector machine classifier. Residual tumour volume was segmented from contrast-enhanced postoperative MRI and log-transformed. Concordance between Simpson and radiological gross-total/subtotal resection classification was assessed using absolute agreement and prevalence-adjusted bias-adjusted kappa (PABAK). Cox models assessed recurrence-free survival, with bootstrap validation and anatomical and scan-timing sensitivity analyses. Results: Among 912 patients, recurrence or residual progression occurred in 281. Surgical-radiological agreement was substantial but imperfect (absolute agreement 74%; PABAK 0.61), with lower agreement in skull-base and parafalcine-parasagittal tumours. In adjusted models, recurrence hazard increased with Simpson grade (hazard ratio 1.54, 95% confidence interval 1.37-1.72), radiological grade (1.92, 1.68-2.20), and log-transformed residual volume (1.20, 1.16-1.24; all p<0.0005). Optimism corrected concordance increased from Simpson grade to radiological grade and log-volumetry (0.692, 0.733, and 0.748), with this ranking preserved across sensitivity analyses. Conclusions: Imaging-based postoperative residual disease measures outperformed Simpson grade. 
TF-IDF-assisted report-derived grading provides a scalable bridge to volumetry, while quantitative residual volume offers the strongest prognostic representation.

11

Prevalence of nutritional, behavioral and anthropometric cancer-related risk factors among adults in Nouakchott, Mauritania: a cross-sectional study

Tolba, N.; Najdi, A.; El Hfid, M.; Hmeied Maham, M.; Brahim, S. M.; Tolba, A.; Sellal, N.

2026-05-26 epidemiology 10.64898/2026.05.23.26353924 medRxiv

Top 2%

0.0%

Background Cancer is a growing public health challenge in low- and middle-income countries, where urbanization, nutritional transition and lifestyle changes contribute to modifiable risk factors. In Mauritania, population-based data on cancer-related nutritional, behavioral and anthropometric risk factors remain limited. Objective To describe the frequency of the main nutritional, behavioral and anthropometric cancer-related risk factors among adults living in the three wilayas of Nouakchott. Methods A cross-sectional study was conducted among 1,000 adults aged 18 years and older in Nouakchott. Data were collected using a standardized questionnaire covering sociodemographic characteristics, dietary habits, physical activity and selected health behaviors. Anthropometric measurements were performed to assess body mass index and abdominal adiposity. Abdominal obesity was defined using sex-specific waist circumference cut-off points recommended by the World Health Organization: ≥ 88 cm in women and ≥ 102 cm in men. Results were presented as frequencies and proportions, with comparisons by sex, age group and wilaya of residence. Results Women represented 52.0% of participants, and 53.5% were aged 18-34 years. Excess body weight was frequent, with 38.6% overweight and 28.0% obese. Abdominal adiposity was also common, with 58.0% having increased or substantially increased waist circumference and 48.3% having an elevated waist-to-hip ratio. Physical inactivity was reported by 64.7% of participants, and 15.7% were current smokers. Dietary exposures included high red meat consumption in 66.8%, daily refined cereal intake in 67.5%, daily sugar-sweetened beverage consumption in 14.9%, and limited daily fresh fruit consumption in 13.8%. Significant differences were observed by sex for anthropometric indicators, by age for selected dietary habits, and by wilaya for physical activity, smoking and selected dietary behaviors. 
Conclusion This study shows a high frequency of modifiable cancer-related risk factors among adults in Nouakchott, particularly excess body weight, abdominal adiposity, physical inactivity and unfavorable dietary habits. These findings support the need to strengthen primary prevention strategies targeting nutrition, physical activity and tobacco control in Mauritania.
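
The abdominal obesity definition quoted in the Methods reduces to a sex-specific threshold check, sketched below. Only the two cut-off values come from the abstract; the function name, the `"female"`/`"male"` keys, and the example measurements are hypothetical.

```python
# Cut-offs quoted in the abstract: >= 88 cm in women, >= 102 cm in men.
WAIST_CUTOFF_CM = {"female": 88.0, "male": 102.0}


def has_abdominal_obesity(sex, waist_cm):
    """Apply the sex-specific waist circumference cut-off (inclusive)."""
    return waist_cm >= WAIST_CUTOFF_CM[sex]


# Hypothetical measurements on either side of each threshold.
flag_woman_at_cutoff = has_abdominal_obesity("female", 88.0)
flag_man_below_cutoff = has_abdominal_obesity("male", 101.9)
```

Because the comparison is inclusive, a measurement exactly at the cut-off counts as abdominal obesity, in line with the "≥" in the stated definition.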

12

Mechanism Matters: A Monte Carlo Evaluation of Estimator Validity and Collider Bias in Environmental Mixture Epidemiology

Obeng-Gyasi, E.

2026-05-26 epidemiology 10.64898/2026.05.25.26354044 medRxiv

Top 2%

0.0%

Background: Mixture epidemiology deploys sophisticated estimators, Bayesian kernel machine regression with causal mediation analysis (BKMR-CMA), quantile G-computation (QGC), and parametric G-computation, alongside conventional regression. Comparative evaluations have assumed additive, non-mediated data-generating processes, leaving conditions under which estimator choice determines causal validity uncharacterized. Methods: We developed a simulation framework using military-relevant exposure distributions (metals, per- and polyfluoroalkyl substances [PFAS], polychlorinated biphenyls [PCBs]) and allostatic load (AL) across three deployment tiers, with parameters drawn from military occupational health and contamination literature. Four data-generating processes were specified as directed acyclic graphs: direct effects with confounding (M1), full mediation through AL (M2), synergistic AL-exposure interaction (M3), and collider structure (M4). We evaluated ordinary least squares (OLS), QGC, G-computation, and BKMR-CMA on bias, root mean squared error, and 95% confidence interval coverage across 500 Monte Carlo replications at n = 500 and n = 1,000. Results: No estimator dominated across all mechanisms. Under M1, OLS and G-computation produced near-identical modest positive bias; BKMR-CMA achieved lower root mean squared error through kernel shrinkage. Under M2, BKMR-CMA exhibited severe positive bias for AL (mean bias = +0.579 SD units; coverage = 32.8%). Under M3, BKMR-CMA was the only estimator achieving nominal 95% coverage for AL (95.2%), while regression-based approaches fell to 83.6%. Under M4, G-computation produced persistent bias and near-zero coverage for lead, reflecting structural non-identification. Conclusions: Estimator validity is fundamentally mechanism-dependent. 
Researchers should base estimator choice on explicit causal assumptions about whether AL functions as confounder, mediator, moderator, or collider, particularly in military and occupational cohorts. We provide a mechanism-to-estimator mapping for applied researchers.

13

DISCERN: A Clinical Impact-aware Framework for Radiology Report Comparison

Sharma, R.; Beeche, C.; Dong, J.; Zhuang, R.; Qu, H.; Zhang, R.; Gangaram, V.; Goswami, P.; Xin, J.; Ballard, J.; Goldberg, A.; Sagreiya, H.; Long, Q.; Chen, T.; Witschey, W. R.

2026-05-27 radiology and imaging 10.64898/2026.05.26.26353612 medRxiv

Top 2%

0.0%

The surge in medical imaging has spurred the development of vision-language models (VLMs) to alleviate radiologist workloads. However, clinical deployment is hindered by the lack of meaningful evaluation frameworks. Current metrics - ranging from semantic similarity to large language model (LLM) based judges - often fail to distinguish between clinically trivial and critical discrepancies, poorly reflecting real-world clinical judgment. To address this, we introduce DISCERN (Discordance and Significance-aware Entity-level Radiology Report Comparison). DISCERN is a significance-aware framework that weighs report errors based on their potential impact on patient care. Our results demonstrate that DISCERN powered by closed source LLMs aligns more closely with expert radiologist assessments than traditional metrics or current LLM evaluators, providing a more interpretable and clinically relevant benchmark. By modeling radiologist prioritization and entity-level feedback, DISCERN facilitates targeted model refinement and ensures the safer integration of generative AI into clinical workflows.

14

Gray Matter Morphological Networks are Associated with Neurobiological Features, Cognitive Status and Clinical Recovery in Traumatic Brain Injury

Sadikov, A.; Cai, L. T.; Xiao, J.; Yuh, E. L.; Choi, H. L.; Sun, X.; Mac Donald, C. L.; Vassar, M. J.; Diaz-Arrastia, R.; Giacino, J. T.; Okonkwo, D. O.; Robertson, C. S.; Stein, M. B.; Temkin, N.; McCrea, M. A.; Jain, S.; Manley, G. T.; Mukherjee, P.; TRACK-TBI Investigators,

2026-05-27 neurology 10.64898/2026.05.25.26354074 medRxiv

Top 2%

0.0%

Generalizable neuroimaging biomarkers that detect cerebral cortical changes after traumatic brain injury (TBI) and predict patient outcomes are needed to improve care and to develop targeted therapies. We used morphometric inverse divergence (MIND) analysis of structural MRI to investigate cortical gray matter morphological networks cross-sectionally and longitudinally after TBI and correlate these with symptoms, disability and cognition six months after injury. Our findings support the Triple Network Model from functional MRI of post-traumatic alterations in the relationship between task-positive, default mode and salience networks. However, the strongest associations between early cortical similarity metrics and long-term patient outcomes involved the dorsal attention network and the limbic network as well as similarity metrics across Mesulam's hierarchy of laminar differentiation. Since MIND mapping of cortical gray matter networks only requires data that is a routine part of standard clinical MRI protocols and does not need image harmonization across different scanners, this work reports a promising new tool that is immediately available for advancing research and clinical care in TBI.

15

Health Literacy and Lifestyle Scores Among A Small but Diverse Group of Older Asian Adults Who Attended Community Health Events in Los Angeles

Zhang, E.; Tran, T.; Shun, K.; Tran, D.; Tsai, A.; Kwang, E.; DerSarkissian, M.; Kuo, T.

2026-05-29 epidemiology 10.64898/2026.05.27.26354181 medRxiv

Top 3%

0.0%

The Asian population in Los Angeles is among the largest and most heterogeneous in the U.S. This is true culturally and health-wise. Older Asians have differing risks for cardiovascular and cardiometabolic disease, depending on their ethnicity, health literacy, and lifestyle choices. This pilot examines several of these factors in a small but diverse group of older Asian adults who attended community health events from 2024-2025. Self-reported and biometric data were collected at five such events hosted by the Asian Pacific Health Corps at UCLA. The pilot generated health literacy and lifestyle (HLL) scores for all participating attendees and explored how they relate to their socio-demographics, healthcare habits, and predictions of their own health data. Overall, there were significantly more females than males with higher HLL scores (p = 0.027). College education (p = 0.028) and "normal" ranges for biometric data (e.g., blood pressure, BMI, blood glucose, cholesterol) were related to higher median HLL scores. With a few exceptions, fewer than 50% accurately predicted their biometric numbers regardless of HLL scores, suggesting a disconnect between perception and reality, and that better provider-patient communication may help foster greater patient understanding about their chronic conditions. These HLL score distributions indicate that educational attainment, better awareness of one's health, and high health literacy are individual factors that may influence older Asians' understanding and potential approach to managing their health conditions.

16

Changes in the profile of adults diagnosed as autistic since 2010: population based studies in England and Sweden

Sadik, A.; Lundberg, M.; Khandaker, G. M.; Pardinas, A. F.; Lee, B. K.; Madley-Dowd, P.; Magnusson, C.; Rai, D.

2026-05-28 epidemiology 10.64898/2026.05.20.26353486 medRxiv

Top 3%

0.0%

Objective: To understand if sociodemographic and neuropsychiatric characteristics of people diagnosed with autism in the United Kingdom (UK) and Sweden have changed since 2010. Design: Cross-context population-based cohort studies. Setting: UK primary care records from 2010-2023 and Swedish population-wide register linkages from 2010-2021. Participants: 24,537,039 individuals aged 16 or over, registered with general practices in the UK, including 141,119 with an autism diagnosis. 9,096,874 people aged 16 or over in the Swedish Total Population Register, including over 100,817 with an autism diagnosis. Main outcome measures: Annual age-standardised incidence and prevalence of adult autism diagnoses within different sociodemographic groups. Annual age-standardised proportion of adults with new autism diagnoses, lifetime autism diagnoses, and no autism diagnoses, with prior records of other neuropsychiatric conditions or medications. Results: Incident adult autism diagnoses were consistently higher in Sweden than in the UK; however, incidence increased rapidly in the UK after 2020. Incident diagnoses increased fastest for 16-25-year-olds and females in both nations, as well as people in White ethnic groups in the UK and people with Swedish-born parents in Sweden. For example, in the UK in 2023 the age-standardised incidence of autism diagnoses among 16-65-year-olds was 11 diagnoses per 10,000 person-years (95%CI: 10.7, 11.3) in the White ethnic group and 2.2 diagnoses per 10,000 person-years (95%CI: 1.9, 2.5) in the South Asian ethnic group. Over time there has been a consistent decline in the proportion of autistic adults with a prior diagnosis of epilepsy, psychosis and intellectual disability and an increase in the proportion with a prior diagnosis of ADHD, anxiety, depression and several other mental illnesses. For example, in the UK between 2010 and 2023 the age-standardised proportions of newly diagnosed autistic adults with prior records of epilepsy decreased from 10% (95%CI: 7.6, 13) to 4% (95%CI: 3.6, 4.5), while the proportion with records of anxiety increased from 28.7% (95%CI: 24.4, 33.6) to 58.3% (95%CI: 56.6, 60.1). Mental health conditions were generally more common in females and the reduction over time in intellectual disability was greater in females than males. Conclusions: The socio-demographic and neuro-psychiatric characteristics of individuals diagnosed as autistic have changed dramatically since 2010, a phenomenon observed both in the UK and Sweden. The extent to which these changes indicate nuanced recognition of autism or broadening of diagnostic practice needs investigation.

17

High-resolution Orbitofrontal Cortex Morphometry and Cannabis Use Disorder Severity in High-risk Emerging Adults: A Preliminary Study

Hargreaves, T. L.; McIntyre-Wood, C.; Elsayed, M.; Vandehei, E.; Belisario, K. L.; Lee, L.; Blakely, A.; Halladay, J. L.; Amlung, M.; Sweet, L. H.; MacKillop, J.

2026-05-27 addiction medicine 10.64898/2026.05.26.26354113 medRxiv

Top 3%

0.0%

Background: Cannabis use is highly prevalent among emerging adults (18-25 years), a developmental period marked by ongoing neurodevelopment and heightened risk for cannabis use disorder (CUD). Structural alterations in the orbitofrontal cortex (OFC) and medial prefrontal/anterior cingulate cortex (mPFC/ACC) have been linked to cannabis use, though findings remain inconsistent in directionality. To address this, we examined cortical thickness and surface area of the OFC and mPFC/ACC subregions using the high-resolution Glasser atlas, allowing for more granular characterization of associations with CUD severity. Method: One hundred eleven emerging adults (41% male, age = 20.6 ± 1.1 years) reporting significant alcohol and/or cannabis use completed clinical assessments and structural MRI. The OFC and mPFC/ACC were segmented into seven and six subregions per hemisphere, respectively. Multiple linear regressions tested associations between cortical thickness or surface area and DSM-5 CUD symptom count, controlling for alcohol use and intracranial volume. Subregions surviving false discovery rate correction were examined in relation to depression, trauma-related symptoms, impulsivity, and cannabis use motives. Results: Greater CUD severity was associated with lower cortical surface area and greater cortical thickness in OFC and mPFC/ACC subregions. Lower OFC surface area was correlated with coping- and enhancement-related cannabis use motives. Lower mPFC/ACC surface area and greater thickness were associated with more severe depression, trauma-related symptoms, and impulsivity. Conclusion: In high-risk emerging adults, greater CUD symptom burden is associated with lower surface area and greater thickness in OFC and mPFC/ACC subregions. Using the high-resolution Glasser atlas, these findings provide a more precise characterization of structural correlates of CUD and highlight potential neurobiological markers linked to affective and motivational processes underlying cannabis use.

18

Early Life Determinants of Forward Compression Wave Intensity in Adults

Haynes, A.; Mynard, J. P.; van der Veen, M.; Carson, J.; Green, D. J.

2026-05-27 cardiovascular medicine 10.64898/2026.05.26.26354176 medRxiv

Top 3%

0.0%

Intro: Characteristics of the pulse wave transmitted through the carotid arteries are predictive of cognitive decline and cerebrovascular health in humans. This study aimed to identify risk factor trajectories in childhood, adolescence and early adulthood that are associated with forward compression wave intensity (FCWI) in the common carotid artery in adults aged 28 years. Methods: Systolic blood pressure (SBP), body mass index (BMI) and fasting blood glucose (FBG) measured at multiple time-points when participants were aged between 8 and 20 years were included in a trajectory analysis. At age 28 years, FCWI was measured in 402 (M=206, F=196) participants who underwent a Duplex ultrasound assessment of the common carotid artery. Statistical analysis assessed differences in FCWI between each trajectory group for males and females separately. Results: In males, four trajectory groups were identified for BMI, three for SBP, and two for FBG. In females, three trajectory groups were identified for each of BMI, SBP, and FBG. In males, having higher BMI (P=0.006), SBP (P=0.021) and FBG (P=0.002) from ages 8-20 years was associated with greater FCWI at age 28 years. In females, no associations were found between FCWI at age 28 years and trajectory groups for BMI (P=0.185), SBP (P=0.289) or FBG (P=0.070). Conclusion: Having high BMI, SBP and FBG throughout childhood, adolescence and early adulthood was associated with higher FCWI in the carotid artery at age 28 years in males, but not females. This may have a direct impact on the etiology of cognitive decline and cerebrovascular disease in later life.

19

ERBB4 deficiency promotes atrial myopathy underlying the atrial fibrillation substrate

Yamaguchi, N.; Santucci, J.; Hong, S. J.; Ferrena, A.; Schlamp, F.; Willett, D.; Casdin, C. J.; Park, P. S.; Lin, X.; Xiao, J.; Hall, S.; Barnard, J.; Achter, J.; Kanhert, K.; Lundby, A.; Chung, M. K.; Van Wagoner, D. R.; Park, D. S.

2026-05-27 cardiovascular medicine 10.64898/2026.05.26.26354173 medRxiv

Top 3%

0.0%

Background: Atrial fibrillation (AF) is a leading cause of stroke, cardiovascular morbidity, and mortality. Atrial myopathy, characterized by progressive metabolic, electrical, and structural changes, creates the arrhythmogenic substrate that drives AF. Defining the key drivers of atrial myopathic processes is essential for targeted therapies that can mitigate AF progression. Here we explore how reduced ERBB4 expression contributes to the development of left atrial myopathy. Methods: We analyzed the Cleveland Clinic Biobank to compare left atrial ERBB4 levels in patients grouped by AF diagnosis. To investigate the impact of reduced ERBB4 levels on atrial tissue substrate, we created mouse models of cardiac-specific Erbb4 deficiency using Mlc2a (myosin light chain 2a)-Cre. Comprehensive physiological assessments were performed. Transcriptomic analyses of the left atrium were performed in an Erbb4 haploinsufficient mouse model and compared with human atrial datasets. Molecular validation of key dysregulated pathways was performed. Results: We found that left atrial ERBB4 levels are reduced in patients with AF. Adult cardiomyocyte-specific Erbb4 heterozygous (Erbb4fl/+;Mlc2a-Cre) mice exhibited prolonged P-wave duration in the absence of ventricular dysfunction. Left atrial transcriptomic analysis in Erbb4 haploinsufficient mice showed upregulation of pathways related to fibrosis, apoptosis, and coagulation, and downregulation of pathways related to fatty acid metabolism and mitochondrial function, mirroring changes observed in pressure overload mouse models. A cross-species transcriptomic comparison revealed significant overlap between ERBB4-correlated gene expression and functional pathways in adult human atria and mice with Erbb4 haploinsufficiency. Validating the transcriptomic data, protein and functional assays demonstrated increased fibrosis, apoptosis, and oxidative stress in the mutant left atrial tissue. Conclusion: Left atrial ERBB4 levels are reduced in AF patients. A mouse model of Erbb4 deficiency and human atrial transcriptomic analyses highlight a role for ERBB4 in supporting normal atrial metabolism while protecting against inflammation, apoptosis, and fibrosis.

20

Can Large Language Models Diagnose Primary Immunodeficiency from Patient-Described Symptoms?

Reteig, L. C.; Woloshin, S.; Maglione, P. J.; Farmer, J. R.; Ong, M.-S.

2026-05-27 allergy and immunology 10.64898/2026.05.26.26353818 medRxiv

Top 3%

0.0%

Patients with primary immunodeficiency (PID) often face prolonged diagnostic delays and may increasingly turn to large language models (LLMs) to interpret their symptoms during this period. We evaluated whether an LLM could recognize PID from symptom descriptions derived from interviews with 21 PID patients. In a prior study, we showed that GPT-4o identified PID in 96% of cases when prompted with physician-written patient histories (Rider et al., JACI, 2024). Here, when prompted with symptom descriptions in patients' own words, GPT-5 identified PID in only 7 cases (33%), although it more broadly suggested immune system issues in 18 cases (81%). The gap between these findings indicates that LLMs are sensitive to the language and framing of symptom descriptions, performing substantially worse when patients describe their own symptoms in everyday language than when clinicians summarize patient histories in structured medical terms. This study underscores the need to carefully evaluate how LLMs are used in patient-facing applications.